Abstract: As one of the recently proposed algorithms for
sparse system identification, the $l_0$ norm constraint Least Mean
Square ($l_0$-LMS) algorithm modifies the cost function of the
traditional method with a penalty on tap-weight sparsity. The
performance of $l_0$-LMS is quite attractive compared with its
various precursors. However, there has been no detailed study of
its performance. This paper presents a comprehensive and thorough
theoretical performance analysis of $l_0$-LMS for white Gaussian
input data based on some reasonable assumptions. Expressions
for the steady-state mean square deviation (MSD) are derived and
discussed with respect to algorithm parameters and system
sparsity. A parameter selection rule is established for achieving
the best performance. Approximated with a Taylor series,
the instantaneous behavior is also derived. In addition, the
relationship between $l_0$-LMS and some previous work, and the
sufficient conditions for $l_0$-LMS to accelerate convergence, are
established. Finally, all of the theoretical results are compared with
simulations and are shown to agree well over a wide range of
parameter settings.

Index Terms: adaptive filter, sparse system identification,
$l_0$-LMS, mean square deviation, convergence rate, steady-state
misalignment, independence assumption, white Gaussian signal,
performance analysis.



I. Introduction 

Adaptive filtering has attracted much research interest in
both theoretical and applied issues for a long time [1]-[3].
Due to its good performance, easy implementation, and
high robustness, the Least Mean Square (LMS) algorithm [1]-[4]
has been widely used in various applications such as system
identification, channel equalization, and echo cancellation.

The unknown systems to be identified are sparse in most
physical scenarios, including echo paths [5] and digital
TV transmission channels [6]. In other words, there are only a
small number of non-zero entries in the long impulse response.
For such systems, the traditional LMS has no particular gain
since it never takes advantage of the prior sparsity knowledge.
In recent years, several new algorithms have been proposed
based on LMS to utilize the feature of sparsity. M-Max
Normalized LMS (MMax-NLMS) [7] and Sequential Partial
Update LMS (S-LMS) [8] decrease the computational cost and
steady-state mean squared error (MSE) by updating
filter tap-weights selectively. Proportionate NLMS (PNLMS)
and its improved version [5], [9] accelerate the convergence by
setting the individual step sizes in proportion to the respective
filter weights.

Sparsity in the adaptive filtering framework has been a long
discussed topic [10], [11]. Inspired by the recently emerged
branch of sparse signal processing [12]-[20], especially
compressive sampling (or compressive sensing, CS) [21]-[23], a
family of sparse system identification algorithms has been
proposed based on the $l_p$ norm constraint. The basic idea of
such algorithms is to exploit the characteristics of the unknown
impulse response and to impose a sparsity constraint on the
cost function of gradient descent. Specifically, ZA-LMS [12]
utilizes the $l_1$ norm and applies zero-point attraction to all
tap-weights. $l_0$-LMS [13] employs a non-convex approximation
of the $l_0$ norm and exerts different attractions on zero and
non-zero coefficients. The smoothed $l_0$ algorithm, which is also
based on an approximation of the $l_0$ norm, is proposed in [24]
and analyzed in [25]. Besides LMS variants, RLS-based sparse
algorithms [14], [15] and Bayesian-based sparse algorithms
[26] have also been proposed.



It is necessary to conduct a theoretical analysis of the $l_0$-LMS
algorithm. Numerical simulations demonstrate that this
algorithm performs rather well compared with
several available sparse system identification algorithms [13],
both accelerating the convergence and decreasing
the steady-state MSD. $l_0$-LMS applies zero-point attraction
to small adaptive taps and pulls them toward the origin,
which consequently increases their convergence speed and
decreases their steady-state bias. Because most coefficients of
a sparse system are zero, the overall identification performance
is enhanced. It is also found that the performance of $l_0$-LMS
is highly affected by the predefined parameters. An improper
parameter setting could not only make the algorithm less
efficient, but also yield a steady-state misalignment even larger
than that of the traditional algorithm. The importance of such an
analysis should be further emphasized since the adaptive filter
framework and $l_0$-LMS behave well in the solution of the sparse
signal recovery problem in compressive sensing [27]. Compared
with some convex relaxation methods and greedy pursuits
[28]-[30], it was experimentally demonstrated that $l_0$-LMS in
the adaptive filtering framework shows more robustness against
noise, requires fewer measurements for perfect reconstruction,
and recovers signals with less sparsity. Considering its
importance as mentioned above, the steady-state performance and
instantaneous behavior of $l_0$-LMS are thoroughly analyzed in
this work.
A. Main contribution

One contribution of this work is the steady-state performance
analysis. Because of the nonlinearity caused by the sparsity
constraint in $l_0$-LMS, the theoretical analysis is rather difficult.
To tackle this problem and enable mathematical tractability,
adaptive tap-weights are sorted into different categories
and several assumptions besides the popular independence
assumption are employed. Then, the stability condition on the step
size and the steady-state misalignment are derived. After that, the
parameter selection rule for optimal steady-state performance
is proposed. Finally, the steady-state MSD gain of $l_0$-LMS over
the traditional algorithm, with the optimal parameter, is obtained
theoretically.

Another contribution of this work is the instantaneous
behavior analysis, which indicates the convergence rate of LMS
type algorithms and has also attracted much attention [31]-[33].
For LMS and most of its linear variants, the convergence
process can be obtained with the same derivation procedure
as the steady-state misalignment. However, this no longer holds
for $l_0$-LMS due to its nonlinearity. Instead, using the
obtained steady-state MSD as a foundation, a Taylor
expansion is employed to get an approximate quantitative
analysis of the convergence process. In addition, the convergence
rates of $l_0$-LMS and standard LMS are compared.



B. Relation to other works 

In order to theoretically characterize the performance and
guide the selection of the optimal algorithm parameters,
mean square analysis has been conducted for standard LMS
and many of its variants. To the best of our knowledge, Widrow
first proposed the LMS algorithm in [34] and
studied its performance in [35]. Later, Horowitz and Senne
[36] established the mathematical framework for mean square
analysis by studying the weight vector covariance matrix and
achieved a closed-form expression of the MSE, which was further
simplified by Feuer and Weinstein [37]. The mean square
performance of two variants, leaky LMS and deficient length
LMS, was theoretically investigated with similar methodologies
in [31] and [32], respectively. Recently, Dabeer and Masry
[33] put forward a new approach for performance analysis of
LMS without assuming a linear regression model. Moreover, the
convergence behavior of transform-domain LMS was studied
in [38] with a second-order autoregressive process. A summarized
analysis was proposed in [39] for a class of adaptive
algorithms that perform linear time-invariant operations on
the instantaneous gradient vector and include LMS as the
simplest case. Similarly, the analysis of Normalized LMS has
also attracted much attention [40], [41].

However, the methodologies mentioned above, which are
effective in their respective contexts, can no longer be directly
applied to the analysis of $l_0$-LMS, considering its high
nonlinearity. Admittedly, nonlinearity is a long-standing topic in
adaptive filtering and is not unique to $l_0$-LMS. Researchers have
delved into the analysis of many other LMS-based nonlinear
variants [42]-[50]. Nevertheless, the nonlinearity in most of the
above references comes from non-linear operations on the
estimated error, rather than on the adaptive tap-weights that
$l_0$-LMS mainly focuses on.

We have noticed that the mean square deviation analysis of
ZA-LMS has been conducted in [46]. However, this work differs
considerably from that reference. First of all, the reference did not
consider the transient performance, while in this work
the mean square behavior of both the steady state and the
convergence process is analyzed. Moreover, $l_0$-LMS is more
sophisticated than ZA-LMS and has more parameters, which
enhances the algorithm performance but increases the difficulty
of theoretical analysis.
Last but not least, when its parameters are taken to a specific
limit setting, $l_0$-LMS becomes essentially the same as ZA-LMS,
so the theoretical results of this work apply to ZA-LMS directly.

A preliminary version of this work has been presented at a
conference [51], including the convergence condition, the
derivation of the steady-state MSD, and an expression for the
optimal parameter selection. This work provides not only a detailed
derivation of the steady-state results, but also the mean square
convergence analysis. Moreover, both the steady-state MSD
and the parameter selection rule are further simplified and
made available for analysis. Finally, more simulations are performed
to validate the results and more discussions are conducted.

This paper is organized as follows. In Section II, a brief
review of $l_0$-LMS and ZA-LMS is presented. Then in Section
III, a few reasonable assumptions are introduced. Based on
these assumptions, Section IV presents the mean square
analysis. Numerical experiments are performed to validate the
theoretical derivation in Section V, and the conclusion is drawn
in Section VI.

II. Background 

A. $l_0$-LMS algorithm

The unknown coefficients and input signal at time instant $n$
are denoted by $\mathbf{s} = [s_0, s_1, \ldots, s_{L-1}]^T$ and
$\mathbf{x}_n = [x_n, x_{n-1}, \ldots, x_{n-L+1}]^T$, respectively,
where $L$ is the filter length. The observed output signal is

$$d_n = \mathbf{x}_n^T \mathbf{s} + v_n, \tag{1}$$

where $v_n$ denotes the additive noise. The estimation error
between the output of the unknown system and that of the adaptive
filter is

$$e_n = d_n - \mathbf{x}_n^T \mathbf{w}_n, \tag{2}$$

where $\mathbf{w}_n = [w_{0,n}, w_{1,n}, \ldots, w_{L-1,n}]^T$ denotes
the adaptive filter tap-weights.

In order to take the sparsity of the unknown coefficients
into account, $l_0$-LMS [13] inserts an $l_0$ norm penalty into the
cost function of standard LMS. The new cost function is

$$\xi_n = e_n^2 + \gamma \|\mathbf{w}_n\|_0,$$

where $\gamma > 0$ is a factor to balance the estimation error and the
new penalty. Due to the NP-hardness of $l_0$ norm optimization,
a continuous function is usually employed to approximate the $l_0$
norm. Taking the popular approximation [52] and making use
of the first-order Taylor expansion, the recursion of $l_0$-LMS is

$$\mathbf{w}_{n+1} = \mathbf{w}_n + \mu e_n \mathbf{x}_n + \kappa\,\mathbf{g}(\mathbf{w}_n), \tag{3}$$
where $\mathbf{g}(\mathbf{w}_n) = [g(w_{0,n}), g(w_{1,n}), \ldots, g(w_{L-1,n})]^T$ and

$$g(t) = \begin{cases} 2\alpha^2 t - 2\alpha\,\mathrm{sgn}(t), & |t| \le 1/\alpha; \\ 0, & \text{elsewhere.} \end{cases} \tag{4}$$

The last term in (3) is called zero-point attraction [13], [27],
because it reduces the distance between $w_{i,n}$ and the origin
when $|w_{i,n}|$ is small. According to (4) and Fig. 1(a), this
attractor is clearly non-linear and exerts different effects on the
respective tap-weights. The attractor is effective for tap-weights
in the interval $(-1/\alpha, 1/\alpha)$, which is named the attraction range.
In this region, the smaller $|w_{i,n}|$ is, the stronger the attraction
becomes.
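The recursion (3) with the attractor (4) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `attraction` and `l0_lms_step` and all numerical values in the toy run are hypothetical and chosen only for demonstration.

```python
import random

def attraction(t, alpha):
    """Zero-point attraction g(t) of (4); zero outside [-1/alpha, 1/alpha]."""
    if t == 0.0 or abs(t) > 1.0 / alpha:
        return 0.0
    sign = 1.0 if t > 0.0 else -1.0
    return 2.0 * alpha ** 2 * t - 2.0 * alpha * sign

def l0_lms_step(w, x, d, mu, kappa, alpha):
    """One iteration of (3). Returns the updated weights and the error e_n."""
    e = d - sum(xi * wi for xi, wi in zip(x, w))  # estimation error, eq. (2)
    w_next = [wi + mu * e * xi + kappa * attraction(wi, alpha)
              for wi, xi in zip(w, x)]
    return w_next, e

if __name__ == "__main__":
    # Toy run with hypothetical values: L = 8 taps, one non-zero coefficient.
    rng = random.Random(0)
    s = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    w = [0.0] * len(s)
    x = [0.0] * len(s)
    for _ in range(5000):
        x = [rng.gauss(0.0, 1.0)] + x[:-1]        # shift in a new input sample
        d = sum(xi * si for xi, si in zip(x, s)) + rng.gauss(0.0, 0.01)
        w, _ = l0_lms_step(w, x, d, mu=0.01, kappa=1e-6, alpha=10.0)
    print("squared deviation:", sum((wi - si) ** 2 for wi, si in zip(w, s)))
```

Setting `kappa=0.0` in `l0_lms_step` reduces the update to standard LMS, which makes the sketch convenient for side-by-side comparisons.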



B. ZA-LMS and RZA-LMS 

ZA-LMS (or Sparse LMS) [12] runs similarly to $l_0$-LMS.
The only difference is that the sparse penalty is changed to the
$l_1$ norm. Accordingly, the zero-point attraction term of the former
is defined as

$$g^{\rm ZA}(t) = -\mathrm{sgn}(t), \tag{5}$$

which is shown in Fig. 1(b). The recursion of ZA-LMS is

$$\mathbf{w}_{n+1} = \mathbf{w}_n + \mu e_n \mathbf{x}_n + \rho\,\mathbf{g}^{\rm ZA}(\mathbf{w}_n), \tag{6}$$

where $\rho$ is the parameter that controls the strength of the sparsity
penalty. Comparing the subfigures in Fig. 1, one can readily
accept that $g(t)$ exerts different attraction on the respective
tap-weights, and therefore it usually behaves better than
$g^{\rm ZA}(t)$. In the following analysis, one will see that ZA-LMS
is a special case of $l_0$-LMS and that the results of this work can
be easily extended to ZA-LMS.

As its improvement, Reweighted ZA-LMS (RZA-LMS) is
also proposed in [12], which modifies the zero-point attraction
term to

$$g^{\rm RZA}(t) = -\frac{\mathrm{sgn}(t)}{1 + \varepsilon |t|}, \tag{7}$$

where the parameter $\varepsilon$ controls the similarity between (7) and
the $l_0$ norm. Please refer to Fig. 1(c) for a better understanding of
the behavior of (7). In Section V, both ZA-LMS and RZA-LMS
are simulated for the purpose of performance comparison.
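To make the contrast between the attractors concrete, the ZA-LMS term (5) and the RZA-LMS term (7) can be written as small Python functions. This is only an illustrative sketch; the names `g_za` and `g_rza` are assumptions introduced here, and `eps` stands for the parameter $\varepsilon$ of (7).

```python
def sgn(t):
    """Sign function returning -1, 0, or 1."""
    return (t > 0) - (t < 0)

def g_za(t):
    """ZA-LMS attraction (5): constant magnitude for every non-zero tap."""
    return -float(sgn(t))

def g_rza(t, eps):
    """RZA-LMS attraction (7): magnitude shrinks as |t| grows."""
    return -sgn(t) / (1.0 + eps * abs(t))

if __name__ == "__main__":
    for t in (0.01, 0.1, 1.0):
        print(t, g_za(t), g_rza(t, eps=10.0))
```

The ZA term pulls every non-zero tap with the same strength, whereas the RZA term weakens for large taps; the $l_0$-LMS attractor (4) goes one step further and vanishes entirely outside the attraction range.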

C. Previous results on LMS and ZA-LMS 

Denote $D_\infty^{\rm LMS}$ and $D_n^{\rm LMS}$ as the steady-state MSD and
the instantaneous MSD after $n$ iterations for LMS with zero-mean
independent Gaussian input, respectively. The steady-state MSD
has the explicit expression [2] of

$$D_\infty^{\rm LMS} = \frac{\mu P_v L}{\Delta_L} = \frac{\mu P_v L}{2 - \mu P_x (L+2)}, \tag{8}$$

where $P_x$ and $P_v$ denote the power of the input signal and the
additive noise, respectively, and $\Delta_L$ is a constant defined by (25)
in Appendix A. For the convergence process, the explicit
expression of the instantaneous MSD is implied in [36] as

$$D_n^{\rm LMS} = \frac{\mu P_v L}{\Delta_L} + \left( \|\mathbf{s}\|_2^2 - \frac{\mu P_v L}{\Delta_L} \right) \left( 1 - \mu P_x \Delta_L \right)^n. \tag{9}$$
Fig. 1. The zero-point attraction of (a) $l_0$-LMS, (b) ZA-LMS, (c) RZA-LMS.
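The LMS expressions (8) and (9) are easy to evaluate numerically. The sketch below assumes $\Delta_L = 2 - \mu P_x (L+2)$, as read from (8), and zero initial tap-weights so that the initial MSD equals $\|\mathbf{s}\|_2^2$, as in (9); the function names are hypothetical.

```python
def lms_steady_state_msd(mu, p_x, p_v, L):
    """Steady-state MSD of LMS, eq. (8), for white Gaussian input."""
    delta_L = 2.0 - mu * p_x * (L + 2)
    if delta_L <= 0.0:
        raise ValueError("step size too large: Delta_L must be positive")
    return mu * p_v * L / delta_L

def lms_msd_curve(mu, p_x, p_v, s, n_iter):
    """Instantaneous MSD of LMS, eq. (9), for n = 0..n_iter (zero initial weights)."""
    L = len(s)
    delta_L = 2.0 - mu * p_x * (L + 2)
    d_inf = lms_steady_state_msd(mu, p_x, p_v, L)
    d_0 = sum(sk * sk for sk in s)                 # ||s||_2^2
    rate = 1.0 - mu * p_x * delta_L                # geometric decay factor
    return [d_inf + (d_0 - d_inf) * rate ** n for n in range(n_iter + 1)]

if __name__ == "__main__":
    s = [1.0] + [0.0] * 7
    curve = lms_msd_curve(mu=0.01, p_x=1.0, p_v=0.01, s=s, n_iter=2000)
    print(curve[0], curve[-1], lms_steady_state_msd(0.01, 1.0, 0.01, len(s)))
```

The decay factor $1 - \mu P_x \Delta_L$ makes the trade-off visible: a larger step size speeds up the decay but raises the floor given by (8).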

Next, turning to ZA-LMS, $D_\infty^{\rm ZA}$ is used to denote the
steady-state MSD with white Gaussian input. Reference [46]
reaches the conclusion that
2 
/^ 
where y is the solution to 



D 



ZA 



2 npP^ + Ao 2 



ALy' + {L-Q)pJ^y 



2„2 



A^P 



27rp^PS 
fL_2Q 
pP.Ao = 0, 



P.' 



(10) 



27r 



1 



IJ-Px 



irp^p^ 

where $Q \le L$ denotes the number of non-zero unknown
coefficients and $\Delta_0$ is a constant defined in the appendix.

D. Related steepest ascent algorithms for sparse decomposition

$l_0$-LMS employs steepest descent recursively and is applicable
to sparse system identification. More generally,
steepest ascent iterations are used in several algorithms in
the field of sparse signal processing. For example, researchers
developed the smoothed $l_0$ method [24] for sparse decomposition,
whose iteration includes a steepest ascent step and a projection
step. The first step is defined as

$$\hat{\mathbf{w}}_{n+1} = \mathbf{w}_n + \mu \mathbf{v}_n^{\rm SL0}, \tag{11}$$

where $\mu$ serves as the step size and
$\mathbf{v}_n^{\rm SL0} = [v_{0,n}^{\rm SL0}, v_{1,n}^{\rm SL0}, \ldots, v_{L-1,n}^{\rm SL0}]^T$
denotes the negative derivative of an approximated $l_0$ norm
and takes the value

$$v_{k,n}^{\rm SL0} = -w_{k,n} \exp\left( -\frac{w_{k,n}^2}{2\sigma^2} \right), \quad 0 \le k < L.$$

After (11), a projection step is performed which maps $\hat{\mathbf{w}}_{n+1}$
to $\mathbf{w}_{n+1}$ in the feasible set. It can be seen that (11) performs
steepest ascent, which is similar to zero-point attraction in
$l_0$-LMS. The iteration details and performance analysis of this
algorithm are presented in [24] and [25], respectively.
Another algorithm, named Iterative Bayesian, also
enjoys a steepest ascent iteration as

$$\mathbf{w}_{n+1} = \mathbf{w}_n + \mu \frac{\partial L}{\partial \mathbf{w}}, \tag{12}$$

where $\mu$ denotes the step size and $L$ is a log posterior
probability function. Analysis of this algorithm and its application
to sparse component analysis in noisy scenarios are presented
in [26].

III. Preliminaries 

Considering the nonlinearity of zero-point attraction, some 
preparations are made to simplify the mean square perfor- 
mance analysis. 

A. Classification of unknown coefficients 

Because $l_0$-LMS exerts different effects on the filter
tap-weights according to their respective system coefficients,
it is helpful to classify the unknown parameters, and
correspondingly the filter tap-weights, into several categories
and to analyze each category separately.
According to the attraction range and strength, all system
coefficients are classified into three categories as

Large coefficients: $C_L = \{k \mid |s_k| > 1/\alpha\}$;
Small coefficients: $C_S = \{k \mid 0 < |s_k| \le 1/\alpha\}$;
Zero coefficients: $C_0 = \{k \mid s_k = 0\}$,

where $0 \le k < L$. Obviously, $|C_L \cup C_S \cup C_0| = L$ and
$|C_L \cup C_S| = Q$. In the following text, derivations are first carried
out for the three sets separately. Then the results are combined to
achieve the final results.
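The three index sets can be computed directly from a coefficient vector. The following sketch is illustrative only; the function name `classify_coefficients` is an assumption, and the example vector is hypothetical.

```python
def classify_coefficients(s, alpha):
    """Split tap indices into (C_L, C_S, C_0) using the attraction range 1/alpha."""
    bound = 1.0 / alpha
    large = [k for k, sk in enumerate(s) if abs(sk) > bound]
    small = [k for k, sk in enumerate(s) if 0.0 < abs(sk) <= bound]
    zero = [k for k, sk in enumerate(s) if sk == 0.0]
    return large, small, zero

if __name__ == "__main__":
    s = [0.0, 0.5, -0.05, 0.0, 0.1]
    c_large, c_small, c_zero = classify_coefficients(s, alpha=10.0)
    print(c_large, c_small, c_zero)
    print("Q =", len(c_large) + len(c_small), "L =", len(s))
```

Because the sets are disjoint and cover all indices, their sizes always sum to $L$, and the sizes of the first two sum to $Q$.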

B. Basic assumptions 

The following assumptions about the system and the predefined
parameters are adopted to enable the formulation.
(i) Input data $x_n$ is an i.i.d. zero-mean Gaussian signal.
(ii) Tap-weights $\mathbf{w}_n$, input vector $\mathbf{x}_n$, and additive
noise $v_n$ are mutually independent.
(iii) The parameter $\kappa$ is so small that $2\alpha^2\kappa \ll \mu P_x$.

Assumption (i) commonly holds, while (ii) is the well-known
independence assumption [3]. Assumption (iii) comes from
experimental observations, i.e., too large a $\kappa$ can cause much
bias as well as a large steady-state MSD. Therefore, in order to
achieve better performance, $\kappa$ should not be too large.

Besides the above items, several regular patterns are assumed
during the convergence.

(iv) All tap-weights $\mathbf{w}_n$ follow a Gaussian distribution.
(v) For $k \in C_L \cup C_S$, the tap-weight $w_{k,n}$ is assumed to
have the same sign as the corresponding unknown coefficient.
(vi) The adaptive weight $w_{k,n}$ is assumed to be outside the
attraction range for $k \in C_L$, and inside the attraction range
elsewhere.

Assumption (iv) is usually accepted for steady-state behavior
analysis [12], [39]. The rationality of assumptions (v) and
(vi) comes from two aspects. First, few taps violate
these assumptions in a common scenario. Intuitively, only the
non-zero taps with rather small absolute values may violate
assumption (v), while assumption (vi) may not hold for the
taps close to the boundaries of the attraction range. For the other
taps, which make up the majority, these assumptions are usually
reasonable, especially in high SNR cases. Second, assumptions
(v) and (vi) are appropriate for small steady-state MSD, which is
emphasized in this work. The smaller the steady-state MSD is, the
less the tap-weights differ from the unknown coefficients. Therefore,
it is more likely that they share the same sign and lie on
the same side of the attraction range boundary.

Based on the discussions above, those patterns can be
adopted in steady state. For the convergence process, due to the
fast convergence of LMS-type algorithms, it is also reasonable
to suppose that most taps get close to the corresponding
unknown coefficients very quickly, which indicates the validity
of these patterns in common scenarios. As we will see later,
some of the above assumptions do not hold in every
parameter setting and may restrict the applicability of some of the
analysis below. However, considering the difficulties of
nonlinear algorithm performance analysis, these assumptions
significantly improve mathematical tractability and help obtain
results shown to be precise over a wide range of parameter
settings. Thus, we consider these assumptions reasonable to
employ in this work.

IV. Performance analysis 

Based on the assumptions above, the mean and mean-square
performances of $l_0$-LMS are analyzed in this section.

A. Mean performance 

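Define the misalignment vector as $\mathbf{h}_n = \mathbf{w}_n - \mathbf{s}$. Combining
(1), (2), and (3), one has

$$\mathbf{h}_{n+1} = \left( \mathbf{I} - \mu \mathbf{x}_n \mathbf{x}_n^T \right) \mathbf{h}_n + \mu v_n \mathbf{x}_n + \kappa\,\mathbf{g}(\mathbf{w}_n). \tag{13}$$

Taking expectation and using assumption (ii), one derives

$$\overline{\mathbf{h}}_\infty = \frac{\kappa}{\mu P_x}\,\overline{\mathbf{g}(\mathbf{w}_\infty)},$$

where the overline denotes expectation.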



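• For $k \in C_L$, utilizing assumption (vi), one has
$\overline{g(w_{k,\infty})} = 0$.
• For $k \in C_S$, combining assumptions (iii), (v), and (vi), it
can be derived that

$$\overline{h}_{k,\infty} = \left( 1 - \frac{2\alpha^2\kappa}{\mu P_x} \right)^{-1} \frac{\kappa g(s_k)}{\mu P_x} \approx \frac{\kappa g(s_k)}{\mu P_x}.$$

• For $k \in C_0$, noticing the fact that $g(x)$ has the opposite
sign to $x$ in the interval $(-1/\alpha, 1/\alpha)$ and using
assumptions (iv) and (vi), it can be derived that
$\overline{g(w_{k,\infty})} = 0$.
Thus, the bias in steady state is obtained as

$$\overline{h}_{k,\infty} = \begin{cases} \dfrac{\kappa g(s_k)}{\mu P_x}, & k \in C_S; \\ 0, & \text{elsewhere.} \end{cases} \tag{14}$$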
In steady state, therefore, the tap-weights are unbiased for
large coefficients and zero coefficients, while they are biased
for small coefficients. The misalignment depends on the
predefined parameters as well as on the unknown coefficient $s_k$ itself.
The smaller the unknown coefficient is, the larger the bias
becomes. This tendency can be directly read from Fig. 1(a).
In the attraction range, the intensity of the zero-point attraction
increases as the tap-weights get closer to zero, which
causes heavy bias. Thus, the bias of small coefficients in steady
state is the byproduct of the attraction, which accelerates the
convergence rate and increases steady-state MSD.
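The steady-state bias in (14) can be evaluated per coefficient, which shows numerically that smaller non-zero coefficients are biased more strongly. The sketch below is an illustration under the approximation of (14) (assumption (iii) holding); the function name and the parameter values are hypothetical.

```python
def steady_state_bias(s, mu, p_x, kappa, alpha):
    """Approximate mean misalignment of each tap in steady state, eq. (14)."""
    bound = 1.0 / alpha
    bias = []
    for sk in s:
        if 0.0 < abs(sk) <= bound:                  # small coefficients, C_S
            sign = 1.0 if sk > 0.0 else -1.0
            g = 2.0 * alpha ** 2 * sk - 2.0 * alpha * sign   # g(s_k), eq. (4)
            bias.append(kappa * g / (mu * p_x))
        else:                                       # large or zero coefficients
            bias.append(0.0)
    return bias

if __name__ == "__main__":
    s = [0.0, 0.5, 0.05, 0.02, -0.05]
    print(steady_state_bias(s, mu=1e-3, p_x=1.0, kappa=1e-7, alpha=10.0))
```

In this toy setting $2\alpha^2\kappa = 2 \times 10^{-5}$ is much smaller than $\mu P_x = 10^{-3}$, so assumption (iii) is respected, and the bias always points toward the origin.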

B. Mean square steady-state performance 

The condition on mean square convergence and the steady-state
MSD are given by the following theorem.

Theorem 1: In order to guarantee convergence, the step size $\mu$
should satisfy

$$0 < \mu < \mu_{\max} = \frac{2}{(L+2) P_x}, \tag{15}$$

and the final mean square deviation of $l_0$-LMS is

$$D_\infty = \frac{\mu P_v L}{\Delta_L} + \beta_1 \kappa^2 - \beta_2 \kappa \sqrt{\kappa^2 + \beta_3}, \tag{16}$$

where $\{\beta_i\}$ are defined in Appendix A.

The proof of Theorem 1 is given in Appendix B.
Remark 1: The steady-state MSD of $l_0$-LMS is composed
of two parts: the first term in (16) is exactly the steady-state
MSD of standard LMS (8), while the latter two terms
compose an additional part caused by zero-point attraction.
When $\kappa$ equals zero, $l_0$-LMS becomes the traditional LMS,
and correspondingly the additional part vanishes. When the
additional part is negative, $l_0$-LMS has a smaller steady-state
MSD and thus better steady-state performance than standard
LMS. Consequently, it can be deduced that the condition on
$\kappa$ to ensure that $l_0$-LMS outperforms LMS in steady state is

$$0 < \kappa < \sqrt{\frac{\beta_2^2 \beta_3}{\beta_1^2 - \beta_2^2}}.$$
Remark 2: According to Theorem 1, the following corollary
on the parameter $\kappa$ is derived.

Corollary 1: From the perspective of steady-state performance,
the best choice for $\kappa$ is

$$\kappa_{\rm opt} = \sqrt{\frac{\beta_3}{2} \left( \frac{\beta_1}{\sqrt{\beta_1^2 - \beta_2^2}} - 1 \right)}, \tag{17}$$

and the minimum steady-state MSD is

$$D_\infty^{\min} = \frac{\mu P_v L}{\Delta_L} + \frac{\beta_3}{2} \left( \sqrt{\beta_1^2 - \beta_2^2} - \beta_1 \right). \tag{18}$$

The proof of Corollary 1 is presented in Appendix C. Please
notice that in (18), the first term is that of standard LMS and
the second one is negative when $Q$ is less than $L$. Therefore,
the minimum steady-state MSD of $l_0$-LMS is less than that of
standard LMS as long as the system is not totally non-sparse.
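The minimization behind Corollary 1 can be checked numerically. Setting the derivative of the additional part of (16), $\beta_1\kappa^2 - \beta_2\kappa\sqrt{\kappa^2+\beta_3}$, to zero gives $\kappa^2 = \frac{\beta_3}{2}\left(\beta_1/\sqrt{\beta_1^2-\beta_2^2} - 1\right)$, and substituting back yields the second term of (18). The sketch below verifies this for hypothetical values of $\beta_1 > \beta_2 > 0$ and $\beta_3 > 0$; the true constants are defined in Appendix A and are not reproduced here.

```python
import math

def excess_msd(kappa, b1, b2, b3):
    """Additional part of (16) caused by zero-point attraction."""
    return b1 * kappa ** 2 - b2 * kappa * math.sqrt(kappa ** 2 + b3)

def kappa_opt(b1, b2, b3):
    """Stationary point of excess_msd in kappa > 0 (requires b1 > b2 > 0)."""
    root = math.sqrt(b1 ** 2 - b2 ** 2)
    return math.sqrt(0.5 * b3 * (b1 / root - 1.0))

def min_excess_msd(b1, b2, b3):
    """Second term of (18): the excess MSD evaluated at kappa_opt."""
    return 0.5 * b3 * (math.sqrt(b1 ** 2 - b2 ** 2) - b1)

if __name__ == "__main__":
    b1, b2, b3 = 2.0, 1.0, 4.0          # hypothetical constants
    k = kappa_opt(b1, b2, b3)
    print(k, excess_msd(k, b1, b2, b3), min_excess_msd(b1, b2, b3))
```

Since $\sqrt{\beta_1^2-\beta_2^2} < \beta_1$, the minimum excess is always negative, which is the numerical counterpart of the statement that the optimally tuned algorithm beats standard LMS in steady state.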

Remark 3: According to the theorem, the steady-state MSD is
not only controlled by the predefined parameters, but also
depends on the unknown system in the following two aspects.
First, the sparsity of the
system response, i.e., $Q$ and $L$, controls the steady-state MSD.



Second, significantly different from standard LMS, the steady- 
state MSD is relevant to the small coefficients of the system, 
considering the attracting strength appears in /3o and /3i. 

Here we mainly discuss the effect of system sparsity as well
as the distribution of coefficients on the minimum steady-state
MSD. Based on the above results, the following corollary can
be deduced.

Corollary 2: The minimum steady-state MSD of (18) is
monotonically increasing with respect to $Q$ and the attracting
strength $G(\mathbf{s})$.

The validation of Corollary 2 is performed in Appendix E.
The zero-point attractor is utilized in $l_0$-LMS to draw
tap-weights towards zero. Consequently, the sparser the
unknown system is, the smaller the steady-state MSD is. Similarly,
small coefficients are biased in steady state and deteriorate
the performance, which explains why the steady-state MSD
increases with respect to $G(\mathbf{s})$.

Remark 4: According to (15), one knows that $l_0$-LMS has
the same convergence condition on step size as standard LMS
and ZA-LMS [46]. Next, the effect of $\mu$ on steady-state
performance is analyzed. It is indicated in (8) that
standard LMS enhances steady-state performance by reducing
the step size [2]. $l_0$-LMS has a similar trend. For the sake of
simplicity and practicability, a sparse system with $Q$ far less
than $L$ is considered to demonstrate this property. Utilizing
(18) in such a scenario, the following corollary is derived.

Corollary 3: For a sparse system which satisfies

$$Q \ll L \quad \text{and} \quad (Q+2)\mu P_x \ll 2, \tag{19}$$

the minimum steady-state MSD in (18) is further approximately
simplified as



d: 



pPyL ) 

Ar 



V6 



m + m+\/'^i + 



32a^L 



G(s) 



(20) 



where $\eta_5$ and $\eta_6$ are defined by (42) in Appendix A, and $G(\mathbf{s})$,
defined by (29), denotes the attracting strength to the zero-point.
Furthermore, the minimum steady-state MSD increases
with respect to the step size.

The proof of Corollary 3 is conducted in Appendix D. Due
to the stochastic gradient descent and zero-point attraction,
the tap-weights oscillate even in steady state, with an
intensity directly related to the step size. The larger the
step size, the more intense the oscillation. Thus, the steady-state
MSD is monotonically increasing with respect to $\mu$ in the
above scenario.

Remark 5: In the scenario where $2\alpha\kappa = \rho$ remains a
constant while $\alpha$ approaches zero, it can be readily accepted
that (3) becomes identical to (6); therefore $l_0$-LMS
becomes ZA-LMS in this limit setting of parameters. In
Appendix F it is shown that the result (10) for steady-state
performance [12] can be regarded as a particular case of
Theorem 1. As $\alpha$ approaches zero in $l_0$-LMS, the attraction
range becomes infinite and all non-zero taps belong to the small
coefficients, which are biased in steady state. Thus, ZA-LMS
has a larger steady-state MSD than $l_0$-LMS, due to the bias of all
taps caused by uniform attraction intensity. If $\kappa$ is further
chosen optimally, the optimal parameter for ZA-LMS is given
by $\rho_{\rm opt} = \lim_{\alpha \to 0} 2\alpha\kappa_{\rm opt}$ (notice that $\kappa_{\rm opt}$ approaches $\infty$ as
$\alpha$ tends to zero, which makes $\rho_{\rm opt}$ finite), and the minimum
steady-state MSD of $l_0$-LMS (18) converges to that of ZA-LMS.
To better compare the three algorithms, the steady-state
MSDs of LMS, ZA-LMS, and $l_0$-LMS are listed in TABLE I,
where that of ZA-LMS is rewritten and $\Gamma$ is defined in (64)
in the appendix. It can be seen that the steady-state MSDs
of both ZA-LMS and $l_0$-LMS are in the form of $D_\infty^{\rm LMS}$ plus
additional terms, where $D_\infty^{\rm LMS}$ denotes the steady-state MSD of
standard LMS. If the additional terms are negative, ZA-LMS
and $l_0$-LMS exceed LMS in steady-state performance.

Remark 6: Now the extreme case in which all taps of the system
are zero, i.e., $Q = 0$, is considered. If $\kappa$ is set to the optimal
value, (18) becomes



rjmin 



2fiP,,LAl 



^J■PvL 

Al 2AL^l + TTfiP^Al' 



(21) 



Since (21) does not depend on $\alpha$, this result also holds
in the scenario of $\alpha$ approaching zero; thus, (21) also applies
to the steady-state MSD of ZA-LMS with optimal $\rho$ in the
extreme case $Q = 0$. Thus, it has been shown that $l_0$-LMS and
ZA-LMS with their respective optimal parameters have the same
steady-state performance for a system with all coefficients
zero. Although this result seems a little strange at first
sight, it is in accordance with intuition considering the
zero-point attraction term in $l_0$-LMS. Since the system only has
zero taps, all $w_{k,\infty}$ only vibrate in a very small region around
zero. The zero-point attraction term is $\kappa g(t) \approx -2\alpha\kappa\,\mathrm{sgn}(t)$
when $t$ is very near zero; thus, as long as $\alpha\kappa$ is kept
constant, this term and the steady-state MSD
have little dependence on $\alpha$ itself. Thus, when $\kappa$ is chosen as
optimal and $Q = 0$, the steady-state MSD generally does not
change with respect to $\alpha$.
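The observation that $\kappa g(t) \approx -2\alpha\kappa\,\mathrm{sgn}(t)$ near the origin can be checked with a short script. The sketch below holds the product $\alpha\kappa$ fixed and varies $\alpha$; the function name `kappa_g` and the numerical values are hypothetical.

```python
def kappa_g(t, alpha, alpha_kappa):
    """kappa * g(t) from (4), with the product alpha * kappa held fixed."""
    if t == 0.0 or abs(t) > 1.0 / alpha:
        return 0.0
    kappa = alpha_kappa / alpha
    sign = 1.0 if t > 0.0 else -1.0
    return kappa * (2.0 * alpha ** 2 * t - 2.0 * alpha * sign)

if __name__ == "__main__":
    t, alpha_kappa = 1e-4, 1e-5          # a tap very close to zero
    for alpha in (1.0, 5.0, 20.0):
        print(alpha, kappa_g(t, alpha, alpha_kappa), -2.0 * alpha_kappa)
```

For taps well inside the attraction range, the printed values differ from $-2\alpha\kappa$ only by the small relative amount $\alpha t$, which is why the steady-state behavior for $Q = 0$ is nearly insensitive to $\alpha$.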

C. Mean square convergence behavior 

Based on the results achieved in steady state, the convergence
process can be derived approximately.

Lemma 1: The instantaneous MSD is the solution to the
first-order difference equations

$$\begin{bmatrix} D_{n+1} \\ \Omega_{n+1} \end{bmatrix} = \mathbf{A} \begin{bmatrix} D_n \\ \Omega_n \end{bmatrix} + \mathbf{b}_n, \tag{22}$$

where $\Omega_n = \sum_{k \in C_0} \overline{h_{k,n}^2}$, and vector $\mathbf{b}_n$ and constant matrix $\mathbf{A}$
are defined in Appendix A. Initial values are

$$\begin{bmatrix} D_0 \\ \Omega_0 \end{bmatrix} = \begin{bmatrix} \|\mathbf{s}\|_2^2 \\ 0 \end{bmatrix}. \tag{23}$$

The derivation of Lemma 1 is given in Appendix G. Since
$\omega$, which is defined by (56), appears in both $\mathbf{A}$ and $\mathbf{b}_n$, the
convergence process is affected by the algorithm parameters, the
length of the system, the number of non-zero unknown
coefficients, and the strength or distribution of small coefficients.
Moreover, the derivation in Appendix H yields the solution to (22)
in the following theorem.

Theorem 2: The closed form of the instantaneous MSD is

$$D_n = c_1 \lambda_1^n + c_2 \lambda_2^n + c_3 \lambda_3^n + D_\infty, \tag{24}$$

where $\lambda_1$ and $\lambda_2$ are the eigenvalues of matrix $\mathbf{A}$, and $c_1$ and $c_2$
are coefficients determined by the initial values (23). The expressions
of the constants $\lambda_3$ and $c_3$ are listed in (33) and (34) in Appendix
A, respectively. $D_\infty$ denotes the steady-state MSD.

The two eigenvalues can be easily calculated. Through the
method of undetermined coefficients, $c_1$ and $c_2$ are obtained
by satisfying the initial values $D_0$ and $D_1$, which are acquired from
(22) and (23). Considering the high complexity of their closed-form
expressions, they are not included in this paper for the
sake of simplicity.
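The method of undetermined coefficients mentioned above amounts to solving a 2-by-2 linear system: evaluating (24) at $n = 0$ and $n = 1$ gives two equations in $c_1$ and $c_2$. The sketch below assumes that $\lambda_1 \ne \lambda_2$ and that $\lambda_3$, $c_3$, $D_\infty$, $D_0$, and $D_1$ are already known; all names and numbers are hypothetical.

```python
def solve_c1_c2(d0, d1, d_inf, lam1, lam2, lam3, c3):
    """Solve (24) at n = 0 and n = 1 for c1 and c2 (requires lam1 != lam2)."""
    r0 = d0 - d_inf - c3            # equals c1 + c2
    r1 = d1 - d_inf - c3 * lam3     # equals c1*lam1 + c2*lam2
    c2 = (r1 - lam1 * r0) / (lam2 - lam1)
    c1 = r0 - c2
    return c1, c2

def msd_closed_form(n, c, lam, d_inf):
    """Evaluate (24) for coefficient triple c and decay-factor triple lam."""
    return sum(ci * li ** n for ci, li in zip(c, lam)) + d_inf

if __name__ == "__main__":
    lam = (0.99, 0.95, 0.90)
    c3, d_inf = 0.05, 0.01
    d0, d1 = 0.96, 0.938            # hypothetical initial values
    c1, c2 = solve_c1_c2(d0, d1, d_inf, lam[0], lam[1], lam[2], c3)
    print(c1, c2, msd_closed_form(100, (c1, c2, c3), lam, d_inf))
```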

Next we discuss the relationship of mean square convergence
between $l_0$-LMS and standard LMS. In the scenario
where $\kappa$ is zero and $l_0$-LMS becomes traditional LMS, it can be
shown after some calculation that $c_1 = c_2 = 0$ in (24), which
then agrees with (9). Now we turn to the MSD
convergence rate of these two algorithms. From the perspective
of step size, one has the following corollary.

Corollary 4: A sufficient condition for $l_0$-LMS to finally
converge more quickly than LMS is $\mu_{\max}/2 < \mu < \mu_{\max}$,
where $\mu_{\max}$ is defined in (15).

The proof is postponed to Appendix I. From Corollary 4,
one knows that for a large step size, the convergence rate
of $l_0$-LMS is finally faster than that of LMS. However, this
condition is not necessary. In fact, $l_0$-LMS can also have a faster
convergence rate for a small step size, as shown in numerical
simulations.

From the perspective of the distribution of system coefficients,
one has another corollary.

Corollary 5: Another sufficient condition to ensure that
$l_0$-LMS finally enjoys acceleration is

$$C_S = \emptyset, \quad \text{or equivalently,} \quad \alpha > \max_{s_k \ne 0} \frac{1}{|s_k|}.$$

This corollary is obtained from the fact that $c_3$ equals zero under
this condition, using a proof similar to that in Appendix I. The full
demonstration is omitted to save space. Therefore, for sparse
systems in which most coefficients are exactly zero, a large
enough $\alpha$ guarantees a faster convergence rate in the end. As
above, this condition is also not necessary. $l_0$-LMS converges
rather fast even if this condition is violated.
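The two sufficient conditions of Corollaries 4 and 5 are simple to test for a given setting. The sketch below assumes the bound $\mu_{\max} = 2/((L+2)P_x)$, which is consistent with the denominator of (8) staying positive; the function names and the example values are hypothetical.

```python
def mu_max(L, p_x):
    """Assumed upper bound on the step size for mean square convergence."""
    return 2.0 / ((L + 2) * p_x)

def satisfies_corollary4(mu, L, p_x):
    """True if mu lies in (mu_max / 2, mu_max)."""
    bound = mu_max(L, p_x)
    return bound / 2.0 < mu < bound

def alpha_threshold(s):
    """max over non-zero s_k of 1/|s_k|; 0.0 if all coefficients are zero."""
    nonzero = [abs(sk) for sk in s if sk != 0.0]
    if not nonzero:
        return 0.0
    return max(1.0 / a for a in nonzero)

def satisfies_corollary5(s, alpha):
    """True if alpha is large enough that the set C_S is empty."""
    return alpha > alpha_threshold(s)

if __name__ == "__main__":
    s = [0.0, 0.5, -0.05, 0.0]
    print(satisfies_corollary4(1.5e-3, 1000, 1.0), satisfies_corollary5(s, 25.0))
```

Both checks are sufficient only: a setting that fails them may still converge faster than LMS, as the text notes.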

V. Numerical experiments 

Five experiments are designed to confirm the theoretical
analysis. The non-zero coefficients of the unknown system
are Gaussian variables with zero mean and unit variance, and
their locations are randomly selected. The input signal and the additive
noise are white zero-mean Gaussian series with various signal-to-noise
ratios (SNR). Simulation results are the deviations averaged over
100 independent trials. For the theoretical calculation, the expectations
of the attracting strengths in (29) and (30) are employed
to avoid dependence on a priori knowledge of the system. The
parameters of these experiments are listed in Table II, where
$\kappa_{\rm opt}$ is calculated by (17).
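The experimental setup described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' simulation code: the attraction function `g` uses the piecewise-linear form $2\alpha^2 t - 2\alpha\,{\rm sgn}(t)$ on $|t| \le 1/\alpha$ that is assumed for (4), the input has unit power, the SNR is taken as $P_x/P_v$, and the function name `simulate_msd` and its much smaller default sizes are hypothetical choices made to keep the run short.

```python
import numpy as np

def g(w, alpha):
    """Assumed zero-point attraction: 2*a^2*w - 2*a*sign(w) for |w| <= 1/a, else 0."""
    inside = np.abs(w) <= 1.0 / alpha
    return np.where(inside, 2.0 * alpha**2 * w - 2.0 * alpha * np.sign(w), 0.0)

def simulate_msd(L=64, Q=4, mu=5e-3, alpha=10.0, kappa=0.0,
                 snr_db=40.0, n_iter=3000, trials=10, seed=0):
    """Average squared deviation ||w_n - s||^2 over independent trials.

    kappa = 0 gives standard LMS; kappa > 0 adds the zero-point attraction.
    """
    rng = np.random.default_rng(seed)
    noise_std = 10.0 ** (-snr_db / 20.0)      # assumes unit input power, SNR = P_x / P_v
    msd = np.zeros(n_iter)
    for _ in range(trials):
        # Sparse system: Q Gaussian taps at random locations, zeros elsewhere.
        s = np.zeros(L)
        s[rng.choice(L, size=Q, replace=False)] = rng.standard_normal(Q)
        w = np.zeros(L)                        # filter starts from zero
        x = np.zeros(L)                        # tapped delay line of the white input
        for n in range(n_iter):
            msd[n] += np.sum((w - s) ** 2)
            x = np.roll(x, 1)
            x[0] = rng.standard_normal()
            d = s @ x + noise_std * rng.standard_normal()
            e = d - w @ x
            w = w + mu * e * x + kappa * g(w, alpha)
    return msd / trials
```

Calling `simulate_msd(kappa=0.0)` and `simulate_msd(kappa=1e-6)` with the same seed yields two learning curves on identical systems and inputs, which is one simple way to compare LMS with the sparsity-aware update.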

In the first experiment, the steady-state performance with
respect to $\kappa$ is considered. Referring to Fig. 2, the theoretical
steady-state MSD of $l_0$-LMS is in good agreement with the
experimental results when the SNR is 40 dB. With the growth of
$\kappa$ from 10^^, the steady-state MSD decreases at first, which
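TABLE I

The steady-state MSDs of three algorithms.

| Algorithm | Relation with $l_0$-LMS | Denotation | Expression | Eq. No. |
|---|---|---|---|---|
| $l_0$-LMS | (none) | $D_\infty$ | $D_\infty^{\rm LMS} + \beta_1\kappa^2 - \beta_2\kappa\sqrt{\kappa^2+\beta_3}$ | Theorem 1 |
| ZA-LMS | $2\alpha\kappa = \rho$ and $\alpha \to 0$ | $D_\infty^{\rm ZA}$ | $D_\infty^{\rm LMS} - \dfrac{\rho(L-Q)\sqrt{\Gamma}}{\sqrt{2\pi}\,\mu^2P_x^2\Delta_L^2} + \dfrac{\rho^2\left(2(L-Q)\Delta_0\Delta_Q + \pi\Delta_L(\mu LP_x + 2Q\Delta_0)\right)}{\pi\mu^2P_x^2\Delta_L^2}$ | (63) |
| LMS | $\kappa = 0$ | $D_\infty^{\rm LMS}$ | $\mu P_vL/\Delta_L$ | (8) |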

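TABLE II

The parameters in experiments.

| Experiment | $L$ | $Q$ | $\mu$ | $\alpha$ | $\kappa$ | SNR |
|---|---|---|---|---|---|---|
| 1 | 1000 | [illegible] | $8\times10^{-4}$ | 10 | swept, one range per SNR [bounds illegible] | 40 dB / 20 dB |
| 2 | 1000 | 100 | $8\times10^{-4}$ | swept up to 56 [lower bound illegible] | $\kappa_{\rm opt}$ | 40 dB |
| 3 | 1000 | swept up to 1000 [lower bound illegible] | $8\times10^{-4}$ | 10 | $\kappa_{\rm opt}$ | 40 dB |
| 4 | 1000 | 100 | [illegible] | 10 | $0.1\kappa_{\rm opt} \to 10\kappa_{\rm opt}$ | 40 dB / 20 dB |
| 5 | 1000 | 100 | swept [bounds illegible] | 10 | $\kappa_{\rm opt}$ | 40 dB |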
Fig. 2. Steady-state MSD of LMS and $l_0$-LMS (with respect to different $\kappa$), where the solid square denotes $\kappa_{\rm opt}$ and SNR is 40 dB.

Fig. 3. Steady-state MSD of LMS and $l_0$-LMS (with respect to different $\kappa$), where the solid square denotes $\kappa_{\rm opt}$ and SNR is 20 dB.



means that proper zero-point attraction is helpful for sufficiently
reducing the amplitude of the tap-weights in $C_0$. On the other
hand, a larger $\kappa$ results in a more intense zero-point attraction
term and increases the bias of the small coefficients in $C_S$. An overly
large $\kappa$ causes too much bias and thus deteriorates the overall
performance. From (17), $\kappa_{\rm opt}$ = 3.75 x 10^^ produces the
minimum steady-state MSD, which is marked with a square
in Fig. 2. Again, the simulation result tallies well with the analytical
value. When the SNR is 20 dB, referring to Fig. 3, the theoretical
result also predicts the trend of the MSD well. However, since
assumptions (v) and (vi) do not hold well in the low-SNR case,
the theoretical result deviates somewhat from the simulation
result.

In the second experiment, the effect of the parameter $\alpha$ on
steady-state performance is investigated; the results are shown in Fig. 4.
RZA-LMS is also tested for performance comparison,
with its parameter $\rho$ set to the optimal values, which
are obtained by experiments. For the sake of simplicity, the
parameter $\varepsilon$ in (7) is set the same as $\alpha$. Simulation results
confirm the validity of the theoretical analysis. With very
small $\alpha$, all tap-weights are attracted toward the zero point and
the steady-state MSD is nearly independent of $\alpha$. As $\alpha$ increases,
a number of taps fall in the attraction range while
the others are out of it. Consequently, the total bias is reduced.
Besides, the results for ZA-LMS are also considered in this
experiment, with the optimal parameter $\rho$ proposed in Remark
5. It is shown that $l_0$-LMS always yields better steady-state
performance than ZA-LMS; moreover, in the scenario where $\alpha$
approaches 0, the MSD of $l_0$-LMS tends to that of ZA-LMS.
In the parameter range we have tested, $l_0$-LMS shows better
steady-state performance than RZA-LMS.

The third experiment studies the effect of the number of non-zero
coefficients on the steady-state deviation; please refer to Fig. 5.
It is readily seen that $l_0$-LMS with optimal $\kappa$ outperforms
traditional LMS in steady state. The fewer the non-zero
unknown coefficients are, the more effectively $l_0$-LMS draws
tap-weights towards zero. Therefore, the effectiveness of $l_0$-LMS
increases with the sparsity of the unknown system. When
$Q$ exactly equals $L$, its performance with optimal $\kappa$ just
attains that of standard LMS, indicating that there is no room
for performance enhancement by $l_0$-LMS for a totally non-sparse
system.
Fig. 4. Steady-state MSD of LMS, ZA-LMS, RZA-LMS (with respect to different $\varepsilon$), and $l_0$-LMS (with respect to different $\alpha$), where $\varepsilon$ equals $\alpha$, and $\rho$ and $\kappa$ are chosen as optimal for RZA-LMS and $l_0$-LMS, respectively.

Fig. 5. Steady-state MSD of LMS and $l_0$-LMS (with respect to different total non-zero taps $Q$), where $\kappa$ is chosen as optimal.

Fig. 6. MSD convergence of LMS and $l_0$-LMS (with respect to different $\kappa$), where SNR is 40 dB.

Fig. 7. MSD convergence of LMS and $l_0$-LMS (with respect to different $\kappa$), where SNR is 20 dB.



The fourth experiment is designed to investigate the convergence
process with respect to $\kappa$. The learning curve
of the standard LMS is also simulated. When the SNR is 40 dB, the
results in Fig. 6 demonstrate that our theoretical analysis of the
convergence process is generally in good accordance with
simulation. It can be observed that different values of $\kappa$ result in
differences in both the steady-state MSD and the convergence rate.
Due to the more intense zero-attraction force, a larger $\kappa$ results in a
higher convergence rate; but too large a $\kappa$ can cause poor steady-state
performance owing to too much bias on the small coefficients.
Moreover, $l_0$-LMS outperforms standard LMS in convergence
rate for all parameters we ran, and also surpasses it in steady-state
performance when $\kappa$ is not too large. When the SNR is 20 dB,
Fig. 7 shows a similar trend in how $\kappa$ influences the
convergence process; however, since the low-SNR scenario
breaks assumptions (v) and (vi), the theoretical results and
experimental results differ to some extent.

The fifth experiment demonstrates the convergence process for
various step sizes, comparing LMS and $l_0$-LMS;
please refer to Fig. 8. As with traditional LMS, a smaller
step size yields a slower convergence rate and a lower steady-state
MSD. Therefore, the choice of step size should seek a balance
between convergence rate and steady-state performance.
Furthermore, the convergence rate of $l_0$-LMS is much faster
than that of LMS when their step sizes are identical.

VI. Conclusion

A complete mean square performance analysis of the $l_0$-LMS
algorithm is presented in this paper, covering both the steady
state and the convergence process. The adaptive filtering taps are
first classified into three categories based on the zero-point
attraction term and then analyzed separately. With the help of
some reasonable assumptions, the steady-state MSD is
deduced and the convergence of the instantaneous MSD is approximately
predicted. Moreover, a parameter selection rule is put
forward to minimize the steady-state MSD, and it is shown theoretically
that $l_0$-LMS with optimal parameters is superior to
traditional LMS for sparse system identification. The
theoretical results are verified over a large range of parameter
settings through numerical simulations.
Fig. 8. MSD convergence of LMS and $l_0$-LMS with respect to different step sizes $\mu$, where $\kappa$ is chosen as optimal for $l_0$-LMS.



Appendix A 
Expressions of constants 

In order to keep the main body simple and focused, the
explicit expressions of some constants used in the derivations are
listed here.

Throughout this work, the four constants

$$\Delta_L = 2 - (L+2)\mu P_x, \tag{25}$$

$$\Delta_Q = 2 - (Q+2)\mu P_x, \tag{26}$$

$$\Delta_0 = 1 - \mu P_x, \tag{27}$$

$$\Delta_0' = 2 - \mu P_x, \tag{28}$$

are used to simplify the expressions.

To evaluate the zero-point attracting strength with respect
to the sparsity of the unknown system coefficients, two kinds
of strengths are defined as

$$G(\mathbf{s}) = \langle g(\mathbf{s}), g(\mathbf{s})\rangle = \sum_{k \in C_S} g^2(s_k), \tag{29}$$

$$G'(\mathbf{s}) = \langle \mathbf{s}, g(\mathbf{s})\rangle = \sum_{k \in C_S} s_k\, g(s_k), \tag{30}$$

which are used throughout this work. Considering the
attraction range, it is readily seen that these strengths
depend only on the small coefficients, not on the large
ones or the zeros.
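As a small numerical illustration of (29) and (30), the sketch below evaluates both strengths for a given coefficient vector. It rests on two assumptions that are not restated in this appendix: that $g$ has the piecewise-linear form $2\alpha^2 t - 2\alpha\,{\rm sgn}(t)$ on $|t| \le 1/\alpha$, and that $C_S$ collects the non-zero coefficients with $|s_k| < 1/\alpha$. The function names are hypothetical.

```python
import numpy as np

def g(t, alpha):
    """Assumed attraction function: 2*a^2*t - 2*a*sign(t) for |t| <= 1/a, else 0."""
    inside = np.abs(t) <= 1.0 / alpha
    return np.where(inside, 2.0 * alpha**2 * t - 2.0 * alpha * np.sign(t), 0.0)

def attracting_strengths(s, alpha):
    """Return (G(s), G'(s)) summed over the small non-zero coefficients."""
    s = np.asarray(s, dtype=float)
    small = (s != 0.0) & (np.abs(s) < 1.0 / alpha)   # assumed definition of C_S
    gs = g(s[small], alpha)
    G = float(np.sum(gs ** 2))           # sum of g^2(s_k), as in (29)
    G_prime = float(np.sum(s[small] * gs))  # sum of s_k * g(s_k), as in (30)
    return G, G_prime
```

For example, with `alpha = 10` the coefficients `0.05` and `-0.05` lie inside the attraction range while `1.0` and `0.0` contribute nothing, so only the two small taps enter both sums.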

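In Lemma 1, $\mathbf{A} = \{a_{ij}\}$ is defined as

$$\mathbf{A} = \begin{bmatrix} 1 - \mu P_x\Delta_L & -\sqrt{\dfrac{8}{\pi}}\dfrac{\alpha\kappa}{\omega}\Delta_0 \\ (L-Q)\mu^2P_x^2 & 1 - 2\mu P_x\Delta_0 - \sqrt{\dfrac{8}{\pi}}\dfrac{\alpha\kappa}{\omega}\Delta_0 \end{bmatrix}, \tag{31}$$

and

$$\mathbf{b}_n = [b_{0,n}, b_{1,n}]^{\rm T}, \tag{32}$$

where

$$b_{0,n} = L\mu^2P_xP_v + (L-Q)\left(4\alpha^2\kappa^2 - \sqrt{\frac{8}{\pi}}\alpha\kappa\omega\Delta_0\right) + \frac{\kappa^2\left(\Delta_0' - 2\Delta_0^{n+1}\right)}{\mu P_x}G(\mathbf{s}) - 2\kappa\Delta_0^{n+1}G'(\mathbf{s}),$$

$$b_{1,n} = (L-Q)\left(\mu^2P_xP_v + 4\alpha^2\kappa^2 - \sqrt{\frac{8}{\pi}}\alpha\kappa\omega\Delta_0\right),$$

where $\omega$ is the solution to (56).

In Theorem 2, the constants $\lambda_3$ and $c_3$ are

$$\lambda_3 = \Delta_0, \tag{33}$$

$$c_3 = -\frac{2\kappa\Delta_0\left(\mu P_x - 2\mu^2P_x^2 + \sqrt{\dfrac{8}{\pi}}\dfrac{\alpha\kappa}{\omega}\Delta_0\right)}{\mu P_x\det(\lambda_3\mathbf{I} - \mathbf{A})}\cdot\left(\kappa G(\mathbf{s}) + \mu P_xG'(\mathbf{s})\right). \tag{34}$$

In Corollary 1, the constants $\{\beta_i\}$ are

$$\beta_0 = \mu P_x\Delta_0'\Delta_LG(\mathbf{s}) + 4\alpha^2\Delta_Q\left(\mu P_x\Delta_L + \frac{\Delta_0\Delta_Q}{\pi}\right), \tag{35}$$

$$\beta_1 = \frac{\Delta_0'G(\mathbf{s}) + 4(L-Q)\alpha^2\left(\mu P_x + \dfrac{2\Delta_0\Delta_Q}{\pi\Delta_L}\right)}{\mu^2P_x^2\Delta_L}, \tag{36}$$

$$\beta_2 = \frac{4\alpha(L-Q)}{\mu^2P_x^2\Delta_L^2}\sqrt{\frac{\Delta_0\beta_0}{\pi}}, \tag{37}$$

$$\beta_3 = 2\mu^3P_x^2P_v\Delta_0\Delta_L/\beta_0. \tag{38}$$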
In Appendices E and D, the constants $\{\eta_i\}$ are



Vo 



V2 = 



m 



IGPya^Al 

{L - Q)0a 
AlAq ' 
Gis)A'„AL 



1 



m 



m 



775 = 4a^P^n + 2G(s)i, T]e 



4a^(£-g)AoAQ 

TrAr 



Wa'^L 



ttAj 



(33) 

(34) 

(35) 

(36) 

(37) 
(38) 

(39) 
(40) 
(41) 
(42) 



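Appendix B
Proof of Theorem 1

Proof: Denote $D_n$ to be the MSD at iteration $n$, and $\mathbf{R}_n$ to
be the second moment matrix of $\mathbf{h}_n$, respectively:

$$D_n = \overline{\mathbf{h}_n^{\rm T}\mathbf{h}_n}, \tag{43}$$

$$\mathbf{R}_n = \overline{\mathbf{h}_n\mathbf{h}_n^{\rm T}}. \tag{44}$$

Substituting the misalignment recursion (13) into (44), and expanding the term
$\overline{\mathbf{x}_n\mathbf{x}_n^{\rm T}\mathbf{h}_n\mathbf{h}_n^{\rm T}\mathbf{x}_n\mathbf{x}_n^{\rm T}}$ into three second moments using the Gaussian
moment factoring theorem [36], one knows

$$\mathbf{R}_{n+1} = (1 - 2\mu P_x\Delta_0)\mathbf{R}_n + \mu^2P_x^2\cdot{\rm tr}\{\mathbf{R}_n\}\mathbf{I} + \mu^2P_xP_v\mathbf{I} + \kappa\Delta_0\overline{\mathbf{h}_ng(\mathbf{w}_n^{\rm T})} + \kappa\Delta_0\overline{g(\mathbf{w}_n)\mathbf{h}_n^{\rm T}} + \kappa^2\overline{g(\mathbf{w}_n)g(\mathbf{w}_n^{\rm T})}. \tag{45}$$

Using the fact that $D_n = {\rm tr}\{\mathbf{R}_n\}$, one has

$$D_{n+1} = (1 - \mu P_x\Delta_L)D_n + L\mu^2P_xP_v + 2\kappa\Delta_0\overline{\mathbf{h}_n^{\rm T}g(\mathbf{w}_n)} + \kappa^2\overline{\|g(\mathbf{w}_n)\|_2^2}.$$

Consequently, the condition needed to ensure convergence is
$|1 - \mu P_x\Delta_L| < 1$, from which the step-size bound of Theorem 1 is derived directly; it is the
same as that of standard LMS and similar to the conclusion in [27].
Next the steady-state MSD will be derived. Using (45) and
considering the $k$th diagonal element, one knows

$$\overline{h_{k,\infty}^2} = \frac{\mu^2P_x^2D_\infty + \mu^2P_xP_v + 2\kappa\Delta_0\overline{h_{k,\infty}g(w_{k,\infty})} + \kappa^2\overline{g^2(w_{k,\infty})}}{2\mu P_x\Delta_0}. \tag{46}$$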
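To develop $\overline{h_{k,\infty}^2}$, one should first investigate two items,
namely $\overline{h_{k,\infty}g(w_{k,\infty})}$ and $\overline{g^2(w_{k,\infty})}$ in (46). For $k \in C_L$,
from assumption (vi) one knows $|w_{k,\infty}| > 1/\alpha$, thus

$$\overline{h_{k,\infty}g(w_{k,\infty})} = \overline{g^2(w_{k,\infty})} = 0. \tag{47}$$

For small coefficients, considering assumptions (v) and (vi),
formula (4) implies that $g(w_{k,\infty})$ is a locally linear function with
slope $2\alpha^2$, which results in

$$g(w_{k,\infty}) = g(s_k) + 2\alpha^2h_{k,\infty}.$$

Thus, it can be shown that

$$\overline{h_{k,\infty}g(w_{k,\infty})} = 2\alpha^2\overline{h_{k,\infty}^2} + g(s_k)\overline{h_{k,\infty}}, \tag{48}$$

$$\overline{g^2(w_{k,\infty})} = 4\alpha^4\overline{h_{k,\infty}^2} + g^2(s_k) + 4\alpha^2g(s_k)\overline{h_{k,\infty}}, \tag{49}$$

where $\overline{h_{k,\infty}}$ is derived in (14).

Then turning to $k \in C_0$, it is readily known that $h_{k,\infty} = w_{k,\infty}$
in this case. Thus, from assumptions (iv) and (vi), the
following results can be derived from the property of the Gaussian
distribution:

$$\overline{h_{k,\infty}g(w_{k,\infty})} = 2\alpha^2\overline{h_{k,\infty}^2} - 2\alpha\overline{|h_{k,\infty}|} = 2\alpha^2\overline{h_{k,\infty}^2} - \frac{4\alpha}{\sqrt{2\pi}}\sqrt{\overline{h_{k,\infty}^2}}, \tag{50}$$

$$\overline{g^2(w_{k,\infty})} = 4\alpha^4\overline{h_{k,\infty}^2} - 8\alpha^3\overline{|h_{k,\infty}|} + 4\alpha^2 = 4\alpha^4\overline{h_{k,\infty}^2} - \frac{16\alpha^3}{\sqrt{2\pi}}\sqrt{\overline{h_{k,\infty}^2}} + 4\alpha^2. \tag{51}$$

Combining assumption (iii), (14), and (47)–(51), one can see
that (46) is equivalent to the following equations for $k$
in $C_L$, $C_S$, and $C_0$, respectively:

$$2\mu P_x\Delta_0\overline{h_{k,\infty}^2} - \mu^2P_x^2D_\infty - \mu^2P_xP_v = 0, \quad k \in C_L, \tag{52}$$

$$2\mu P_x\Delta_0\overline{h_{k,\infty}^2} - \mu^2P_x^2D_\infty - \mu^2P_xP_v - \kappa^2g^2(s_k)\left(\frac{2}{\mu P_x} - 1\right) = 0, \quad k \in C_S, \tag{53}$$

$$2\mu P_x\Delta_0\omega^2 + \frac{8\alpha\kappa\Delta_0}{\sqrt{2\pi}}\omega - \mu^2P_x^2D_\infty - \mu^2P_xP_v - 4\alpha^2\kappa^2 = 0, \quad k \in C_0, \tag{54}$$

where $\omega$ denotes $\sqrt{\overline{h_{k,\infty}^2}}$, $k \in C_0$, for simplicity. Summing (52)
and (53) over all $k \in C_L \cup C_S$ and noticing that

$$\sum_{k \in C_L \cup C_S}\overline{h_{k,\infty}^2} = D_\infty - (L-Q)\omega^2,$$

it could be derived that

$$D_\infty = \frac{2(L-Q)\Delta_0}{\Delta_Q}\omega^2 + \frac{Q\mu P_v}{\Delta_Q} + \frac{\kappa^2\Delta_0'}{\mu^2P_x^2\Delta_Q}\cdot G(\mathbf{s}), \tag{55}$$

where $G(\mathbf{s})$ is introduced in (29). Combining (55) and (54),
it can be reached that $\omega$ is defined by the following equation:

$$2\mu P_x\Delta_0\Delta_L\omega^2 + \frac{8\alpha\kappa\Delta_0\Delta_Q}{\sqrt{2\pi}}\omega - 2\mu^2P_xP_v\Delta_0 - 4\alpha^2\kappa^2\Delta_Q - \kappa^2\Delta_0'G(\mathbf{s}) = 0. \tag{56}$$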
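The last two steps of this derivation are easy to evaluate numerically: solve the quadratic (56) for its positive root $\omega$, then substitute into (55). The following is a minimal sketch of that computation, assuming the constants of (25)–(28) as reconstructed above; the function name `steady_state_msd` and its argument list are hypothetical.

```python
import math

def steady_state_msd(L, Q, mu, Px, Pv, alpha, kappa, G):
    """Return (D_inf, omega) from the quadratic (56) and the sum rule (55)."""
    m = mu * Px
    d0 = 1.0 - m                 # Delta_0
    dL = 2.0 - (L + 2) * m       # Delta_L
    dQ = 2.0 - (Q + 2) * m       # Delta_Q
    d0p = 2.0 - m                # Delta_0'
    # Quadratic a*w^2 + b*w - c = 0 taken from (56).
    a = 2.0 * m * d0 * dL
    b = 8.0 * alpha * kappa * d0 * dQ / math.sqrt(2.0 * math.pi)
    c = 2.0 * mu**2 * Px * Pv * d0 + 4.0 * alpha**2 * kappa**2 * dQ + kappa**2 * d0p * G
    omega = (-b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)   # positive root
    # Steady-state MSD from (55).
    D_inf = (2.0 * (L - Q) * d0 / dQ) * omega**2 + Q * mu * Pv / dQ \
        + kappa**2 * d0p * G / (m**2 * dQ)
    return D_inf, omega
```

With `kappa = 0` the result collapses to $\mu P_vL/\Delta_L$, the standard LMS value, which gives a quick sanity check on the reconstruction.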



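Appendix C
Proof of Corollary 1

Proof: By defining $\theta = \arctan(\kappa/\sqrt{\beta_3})$, the steady-state MSD of Theorem 1 becomes

$$D_\infty = \mu P_vL/\Delta_L - \beta_1\beta_3 + \beta_3\cdot f(\sin\theta)/2, \tag{57}$$

where $f(x)$ is defined as

$$f(x) = \frac{\beta_1 - \beta_2}{1 - x} + \frac{\beta_1 + \beta_2}{1 + x}, \quad x \in (0, 1).$$

Next we want to find $x_{\rm opt} \in (0, 1)$ which minimizes $f(x)$.
Forcing the derivative of $f(x)$ with respect to $x$ to be zero, it
can be obtained that

$$0 < x_{\rm opt} = \frac{\sqrt{\beta_1+\beta_2} - \sqrt{\beta_1-\beta_2}}{\sqrt{\beta_1+\beta_2} + \sqrt{\beta_1-\beta_2}} < 1.$$

Combining $\theta_{\rm opt} = \arcsin(x_{\rm opt})$ and substituting $x_{\rm opt}$ in (57),
Corollary 1 can be finally achieved. ■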
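The minimization step can be checked numerically. The sketch below evaluates $f(x)$ and the closed-form $x_{\rm opt}$, and maps it back to a step of the form $\kappa = \sqrt{\beta_3}\tan(\arcsin x)$, which follows from the substitution $\theta = \arctan(\kappa/\sqrt{\beta_3})$. It assumes $\beta_1 > \beta_2 > 0$ and $\beta_3 > 0$; the function names are hypothetical.

```python
import math

def f(x, beta1, beta2):
    """Objective of Appendix C on the open interval (0, 1)."""
    return (beta1 - beta2) / (1.0 - x) + (beta1 + beta2) / (1.0 + x)

def x_opt(beta1, beta2):
    """Closed-form minimizer of f; requires beta1 > beta2 > 0."""
    r = math.sqrt(beta1 + beta2)
    s = math.sqrt(beta1 - beta2)
    return (r - s) / (r + s)

def kappa_from_x(x, beta3):
    """Invert theta = arctan(kappa / sqrt(beta3)) with sin(theta) = x."""
    return math.sqrt(beta3) * x / math.sqrt(1.0 - x * x)
```

At the minimizer, $f(x_{\rm opt}) = (\sqrt{\beta_1+\beta_2} + \sqrt{\beta_1-\beta_2})^2/2$, which can be compared against neighbouring values of $f$ to confirm that the stationary point is a minimum.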

Appendix D
Proof of Corollary 3

Proof: For a sparse system in accordance with (19), $\{\eta_i\}$
defined in Appendix A are approximated by

$$\eta_0 \approx \frac{16P_v\alpha^2}{\pi\mu P_x^2\Delta_L^3}, \quad \eta_2 \approx \eta_3 + L\mu P_xG(\mathbf{s}) + 4\alpha^2\mu P_xL,$$

$$\eta_3 \approx \frac{8\alpha^2L}{\pi\Delta_L}, \quad \eta_4 \approx \Delta_LG(\mathbf{s}).$$

Substituting these approximations of $\{\eta_i\}$ into the $\{\beta_i\}$ of (59),
(20) is finally derived after calculation.

Next we show that $D_\infty^{\min}$ in (20) is monotonically increasing with
respect to $\mu$. Since (20) is equivalent to



£»" 



i.^ (^75 + V%^ + ^^G(s)) 



(58) 






it can be directly observed from (42) that a larger $\mu$ results in a
larger numerator as well as a smaller denominator in (58), both of which
contribute to the fact that $D_\infty^{\min}$ is monotonically increasing
with respect to $\mu$. Thus, the proof of Corollary 3 is completed.



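Appendix E
Proof of Corollary 2

Proof: From (8), Corollary 1, and the constants defined in Appendix A, it can be
obtained that

$$D_\infty^{\min} = D_\infty^{\rm LMS} - \frac{\eta_0}{\dfrac{\beta_1}{(L-Q)^2} + \dfrac{\sqrt{\beta_1^2 - \beta_2^2}}{(L-Q)^2}}. \tag{59}$$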



Finally, the steady-state MSD of Theorem 1 is achieved after solving the quadratic equation
above and a series of formula transformations on (55). Thus,
the proof of Theorem 1 is completed. ■



Note that neither $D_\infty^{\rm LMS}$ defined in (8) nor $\eta_0$ defined in (39)
depends on $Q$ or $G(\mathbf{s})$; thus the focus of the proof is
the denominator in (59). In the following, we analyze
the two items in the denominator separately and establish their
monotonicity.
The first item in the denominator is

$$\frac{\beta_1}{(L-Q)^2} = \frac{1}{\mu^2P_x^2\Delta_L}\left[\frac{\Delta_0'G(\mathbf{s})}{(L-Q)^2} + \frac{4\alpha^2}{L-Q}\left(\mu P_x + \frac{2\Delta_0\Delta_Q}{\pi\Delta_L}\right)\right]. \tag{60}$$

From (60), it is seen that $\beta_1/(L-Q)^2$ is increasing
with respect to $Q$ and $G(\mathbf{s})$.

Next we consider the second item. It can be obtained beforehand
that $\beta_1$ and $\beta_2^2$ equal $\eta_1(\eta_2 + \eta_3 + \eta_4)$ and $4\eta_1^2\eta_2\eta_3$,
respectively. Thus, one has

$$\frac{\sqrt{\beta_1^2 - \beta_2^2}}{(L-Q)^2} = \eta_1\sqrt{\frac{\eta_4^2 + 2\eta_4(\eta_2 + \eta_3) + (\eta_2 - \eta_3)^2}{(L-Q)^4}}. \tag{61}$$

Further notice that

$$\eta_2 - \eta_3 = \mu P_x(L-Q)\left(4\alpha^2 + \frac{\Delta_0'}{\Delta_Q}G(\mathbf{s})\right),$$

$$\eta_2 + \eta_3 = 4(L-Q)\alpha^2\left(\mu P_x + \frac{2\Delta_0\Delta_Q}{\pi\Delta_L}\right) + \mu P_x(L-Q)\frac{\Delta_0'}{\Delta_Q}G(\mathbf{s}),$$

it can be proved that all three items in the square root
of (61) are increasing with respect to $Q$ and $G(\mathbf{s})$; thus the
second item in the denominator is monotonically increasing with
respect to $Q$ and $G(\mathbf{s})$. The monotonicity of $D_\infty^{\min}$
with respect to $Q$ and $G(\mathbf{s})$ has thus been proved.

Last, in the special scenario where $Q$ exactly equals $L$, it
can be obtained that $D_\infty^{\min}$ is identical to $D_\infty^{\rm LMS}$; thus $D_\infty^{\rm LMS}$
is larger than the minimum steady-state MSD of the scenario
where $Q$ is less than $L$. In sum, Corollary 2 is proved. ■

Appendix F
Relationship with ZA-LMS

When $2\alpha\kappa = \rho$ remains a constant while $\alpha$ approaches zero,
from (3), (4), and (5), it is obvious that the recursion of $l_0$-LMS
becomes that of ZA-LMS. Furthermore, one can see that
$g^2(x)$ equals $4\alpha^2 + o(\alpha^2)$. From the definition, it can be shown that
$C_L$ is an empty set when $\alpha$ approaches zero. Consequently,

$$G(\mathbf{s}) = |C_S|\cdot\left(4\alpha^2 + o(\alpha^2)\right) = 4\alpha^2Q + o(\alpha^2). \tag{62}$$

Combining (55), (56), and (62), after a series of calculations
the explicit expression of the steady-state MSD becomes

$$D_\infty^{\rm ZA} = -\frac{(L-Q)\rho\sqrt{\Gamma}}{\sqrt{2\pi}\,\mu^2P_x^2\Delta_L^2} + \frac{2\rho^2(L-Q)\Delta_0\Delta_Q}{\pi\mu^2P_x^2\Delta_L^2} + \frac{\rho^2(\mu LP_x + 2Q\Delta_0) + L\mu^3P_x^2P_v}{\mu^2P_x^2\Delta_L}, \tag{63}$$

where $\Gamma$ is the discriminant of the quadratic equation (56),

$$\Gamma = \frac{8\rho^2\Delta_0^2\Delta_Q^2}{\pi} + 16\mu P_x\Delta_0^2\Delta_L\left(\rho^2(Q+1) + \mu^2P_xP_v\right). \tag{64}$$

Through a series of calculations, it can be proved that (63) is
equivalent to (10) obtained in [26]. Thus, the steady-state
MSD of ZA-LMS can be regarded as a particular case of
that of $l_0$-LMS.



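Appendix G
Proof of Lemma 1

Proof: From (45), the update formula is

$$\overline{h_{k,n+1}^2} = (1 - 2\mu P_x\Delta_0)\overline{h_{k,n}^2} + \mu^2P_x^2D_n + \mu^2P_xP_v + 2\kappa\Delta_0\overline{h_{k,n}g(w_{k,n})} + \kappa^2\overline{g^2(w_{k,n})}. \tag{65}$$

Since the LMS algorithm has a fast convergence rate, it is reasonable
to suppose that most filter tap-weights get close to
the corresponding system coefficients very quickly; thus, the
classification of the coefficients $s_k$ helps in the derivation of
the convergence behavior of $h_{k,n}$.

For $k \in C_L$, from assumption (vi), (65) takes the form

$$\overline{h_{k,n+1}^2} = (1 - 2\mu P_x\Delta_0)\overline{h_{k,n}^2} + \mu^2P_x^2D_n + \mu^2P_xP_v, \quad k \in C_L. \tag{66}$$

For $k \in C_S$, the mean convergence is first derived and then
the mean square convergence is deduced. Taking expectation in
(13) and combining assumptions (iii), (v), and (vi), one knows

$$\overline{h_{k,n+1}} = \Delta_0\overline{h_{k,n}} + \kappa g(s_k), \quad k \in C_S.$$

Since $h_{k,0} = -s_k$, one can finally get

$$\overline{h_{k,n}} = \frac{\kappa g(s_k)}{\mu P_x} - \frac{\mu P_xs_k + \kappa g(s_k)}{\mu P_x}\Delta_0^n. \tag{67}$$

Combining (65) and (67) and employing assumption (iii), it can
be achieved that

$$\overline{h_{k,n+1}^2} = (1 - 2\mu P_x\Delta_0)\overline{h_{k,n}^2} + \mu^2P_x^2D_n + \mu^2P_xP_v + 2\kappa\Delta_0g(s_k)\overline{h_{k,n}} + \kappa^2g^2(s_k), \quad k \in C_S. \tag{68}$$

Next turn to $k \in C_0$. From assumption (iv), the following
formula can be attained employing the steady-state result and
first-order Taylor expansion:

$$\overline{|h_{k,n}|} \approx \frac{1}{\sqrt{2\pi}}\left(\omega + \frac{\overline{h_{k,n}^2}}{\omega}\right),$$

where $\omega = \sqrt{\overline{h_{k,\infty}^2}}$, $k \in C_0$, which is the solution to equation (56).
Finally, with assumption (iii) we have

$$\overline{h_{k,n+1}^2} = \left(1 - 2\mu P_x\Delta_0 - \sqrt{\frac{8}{\pi}}\frac{\alpha\kappa}{\omega}\Delta_0\right)\overline{h_{k,n}^2} + \mu^2P_x^2D_n + \mu^2P_xP_v + 4\alpha^2\kappa^2 - \sqrt{\frac{8}{\pi}}\alpha\kappa\omega\Delta_0, \quad k \in C_0. \tag{69}$$

Considering $\Omega_n = \sum_{k \in C_0}\overline{h_{k,n}^2}$, and combining (66), (67), (68),
and (69), one can obtain (22) after a series of derivations. As
for the initial value, since $\mathbf{w}_0 = \mathbf{0}$, by definition we have
$D_0 = \|\mathbf{s}\|_2^2$ and $\Omega_0 = 0$. Thus, Lemma 1 is reached. ■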
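The two-state recursion of Lemma 1 can be iterated directly to trace the predicted learning curve. The sketch below is an illustration under assumptions, not the authors' code: it uses the entries of $\mathbf{A}$ and $\mathbf{b}_n$ as reconstructed in Appendix A, obtains $\omega$ from the positive root of (56), and starts from $D_0 = \|\mathbf{s}\|_2^2$ and $\Omega_0 = 0$. The function names are hypothetical.

```python
import math
import numpy as np

def steady_state(L, Q, mu, Px, Pv, alpha, kappa, G):
    """Positive root omega of (56) and the steady-state MSD from (55)."""
    m = mu * Px
    d0, dL, dQ, d0p = 1 - m, 2 - (L + 2) * m, 2 - (Q + 2) * m, 2 - m
    a = 2 * m * d0 * dL
    b = 8 * alpha * kappa * d0 * dQ / math.sqrt(2 * math.pi)
    c = 2 * mu**2 * Px * Pv * d0 + 4 * alpha**2 * kappa**2 * dQ + kappa**2 * d0p * G
    omega = (-b + math.sqrt(b * b + 4 * a * c)) / (2 * a)
    D_inf = 2 * (L - Q) * d0 / dQ * omega**2 + Q * mu * Pv / dQ \
        + kappa**2 * d0p * G / (m**2 * dQ)
    return D_inf, omega

def iterate_lemma1(L, Q, mu, Px, Pv, alpha, kappa, G, G_prime, D0, n_iter):
    """Iterate [D, Omega]_{n+1} = A [D, Omega]_n + b_n and return the D_n trajectory."""
    m = mu * Px
    d0, dL, d0p = 1 - m, 2 - (L + 2) * m, 2 - m
    _, omega = steady_state(L, Q, mu, Px, Pv, alpha, kappa, G)
    c = math.sqrt(8 / math.pi) * alpha * kappa * d0
    A = np.array([[1 - m * dL, -c / omega],
                  [(L - Q) * m**2, 1 - 2 * m * d0 - c / omega]])
    b1 = (L - Q) * (mu**2 * Px * Pv + 4 * alpha**2 * kappa**2 - c * omega)
    y = np.array([D0, 0.0])          # Omega_0 = 0 because w_0 = 0
    traj = [D0]
    for n in range(n_iter):
        decay = d0 ** (n + 1)        # transient factor Delta_0^{n+1} in b_{0,n}
        b0 = (L * mu**2 * Px * Pv + (L - Q) * (4 * alpha**2 * kappa**2 - c * omega)
              + kappa**2 * (d0p - 2 * decay) / m * G - 2 * kappa * decay * G_prime)
        y = A @ y + np.array([b0, b1])
        traj.append(y[0])
    return np.array(traj)
```

After enough iterations the trajectory settles at the value given by (55) and (56), so comparing `traj[-1]` with `steady_state(...)[0]` is a convenient consistency check between the transient and steady-state analyses.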

Appendix H
Proof of Theorem 2

Proof: The vector $\mathbf{b}_n$ in (32) could be denoted as

$$\mathbf{b}_n = \begin{bmatrix} b_{00} + b_{01}\lambda_3^{n+1} \\ b_1 \end{bmatrix}, \tag{70}$$

where $\lambda_3$ is defined in (33) and $b_{00}$, $b_{01}$, $b_1$ are constants. Taking the
$z$-transform of (22), it can be derived that

$$\begin{bmatrix} D(z) \\ \Omega(z) \end{bmatrix} = (z\mathbf{I} - \mathbf{A})^{-1}z\begin{bmatrix} D_0 \\ 0 \end{bmatrix} + (z\mathbf{I} - \mathbf{A})^{-1}\mathbf{b}(z),$$

where $|z| > 1$. Then combining the definition of $\{\lambda_i\}$ in Theorem
2 and the above results, it is further derived that

$$D(z) = \sum_{i=0}^{3}\frac{c_i}{1 - \lambda_iz^{-1}},$$

where $\lambda_0 = 1$ and $\{c_i\}$ are constants. Taking the inverse $z$-transform
and noticing the definition of $D_\infty$, it finally yields

$$D_n = D_\infty + c_1\lambda_1^n + c_2\lambda_2^n + c_3\lambda_3^n.$$

Thus we have completed the proof of (24). By forcing the
equivalence between (24) and Lemma 1, the expression of $c_3$
can be solved as (34). ■



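Appendix I
Proof of Corollary 4

Proof: Define the function

$$p(x) = \det|x\mathbf{I} - \mathbf{A}|, \quad x \in \mathbb{R},$$

then the roots of $p(x)$ are the eigenvalues of the matrix $\mathbf{A}$. From (31),
it can be shown that

$$\det|a_{00}\mathbf{I} - \mathbf{A}| = \det|a_{11}\mathbf{I} - \mathbf{A}| = -a_{01}a_{10} > 0,$$

and

$$\det\left|\frac{a_{00} + a_{11}}{2}\mathbf{I} - \mathbf{A}\right| = -\frac{1}{4}\left(L\mu^2P_x^2 - \sqrt{\frac{8}{\pi}}\frac{\alpha\kappa}{\omega}\Delta_0\right)^2 - Q\mu^2P_x^2\sqrt{\frac{8}{\pi}}\frac{\alpha\kappa}{\omega}\Delta_0 < 0,$$

where $\{a_{ij}\}$ denote the entries of $\mathbf{A}$. Thus, we know $p(a_{11}) > 0$
and $p\left(\frac{a_{00}+a_{11}}{2}\right) < 0$, which indicates that one root of the
quadratic function $p(x)$ is within the interval $\left(a_{11}, \frac{a_{00}+a_{11}}{2}\right]$.
Similarly, another root of $p(x)$ is in $\left[\frac{a_{00}+a_{11}}{2}, a_{00}\right)$. Thus, it
can be concluded that the eigenvalues of $\mathbf{A}$ are both in $\mathbb{R}$ and
satisfy

$$a_{11} < \lambda_1 < \frac{a_{00} + a_{11}}{2} < \lambda_2 < a_{00} = 1 - \mu P_x\Delta_L. \tag{71}$$

For the large step size scenario of $1 < \mu(L+2)P_x < 2$, (33) and
(71) yield

$$\max\{\lambda_1, \lambda_2, \lambda_3\} < 1 - \mu P_x\Delta_L.$$

Through comparison between (24) and (9), one can see that
for large $\mu$, all three transient items in the MSD convergence
of $l_0$-LMS have faster attenuation rates than that of LMS, leading to
an acceleration of the convergence rate. ■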
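The eigenvalue ordering in (71) and the large-step-size condition can be verified numerically for any particular parameter set. The sketch below builds $\mathbf{A}$ from the entries reconstructed in (31) and checks both claims; the value of $\omega$ is passed in as a free parameter here (in the analysis it is the root of (56)), and the function names are hypothetical.

```python
import math
import numpy as np

def transition_matrix(L, Q, mu, Px, alpha, kappa, omega):
    """Matrix A of Lemma 1, using the reconstructed entries of (31)."""
    m = mu * Px
    d0 = 1.0 - m
    dL = 2.0 - (L + 2) * m
    c = math.sqrt(8.0 / math.pi) * alpha * kappa * d0 / omega
    return np.array([[1.0 - m * dL, -c],
                     [(L - Q) * m**2, 1.0 - 2.0 * m * d0 - c]])

def check_corollary4(L, Q, mu, Px, alpha, kappa, omega):
    """Return (interlaced, faster): the ordering (71) and max eigenvalue < LMS rate."""
    A = transition_matrix(L, Q, mu, Px, alpha, kappa, omega)
    lam1, lam2 = np.sort(np.linalg.eigvals(A).real)
    a00, a11 = A[0, 0], A[1, 1]
    mid = 0.5 * (a00 + a11)
    interlaced = bool(a11 < lam1 < mid < lam2 < a00)
    lam3 = 1.0 - mu * Px                      # lambda_3 = Delta_0 from (33)
    faster = bool(max(lam1, lam2, lam3) < a00)  # a00 is the LMS decay factor
    return interlaced, faster
```

For a step size with $\mu(L+2)P_x = 1.5$ both flags come out true, while for $\mu(L+2)P_x = 0.5$ the ordering (71) still holds but $\lambda_3$ exceeds the LMS factor, which is consistent with the condition being sufficient rather than necessary.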